Abstract

This is a companion note to Zinde-Walsh (2010) that clarifies and extends
results on identification in a number of problems that lead to a system of
convolution equations. Examples include identification of the distribution of
mismeasured variables, of a nonparametric regression function under Berkson
type measurement error, some nonparametric panel data models, etc.
Identification in these different problems can be treated by a single
approach because they lead to the same system of convolution equations;
moreover, the solution can be given under more general assumptions than those
usually considered, by examining these equations in spaces of generalized
functions. An important issue that has not received sufficient attention is
well-posedness. This note gives conditions under which well-posedness
obtains, gives an example demonstrating that when well-posedness does not
hold, functions that are far apart can give rise to arbitrarily close
observable functions, and discusses misspecification and estimation from the
standpoint of well-posedness.



1 Introduction 



The results of this paper apply to a number of econometric problems,
including the examples below.

Example 1. The distribution of a mismeasured variable with another
observation.

See, e.g., the reviews of Carroll, Ruppert and Stefanski (1995) and Chen,
Hong and Nekipelov (2009); the problem is examined in Cunha, Heckman and
Schennach (2010).

Suppose that $g$ is the density of a mismeasured variable $x^*$; $z$ is
observed and has density $w_1$, with $z = x^* + u$, where $u$ is a
measurement/contamination error independent of $x^*$ with density $f$.
Another observation, $x$, on $x^*$ is available: $x = x^* + u_x$, where $u_x$
is not necessarily independent but $E(u_x | x^*, u) = 0$.



Example 2. Errors in variables regression (EIV) model with Berkson
type measurement error.

See the review in Chen, Hong and Nekipelov (2009); the model is examined by
Newey (2001), the univariate case in Schennach (2007) and Zinde-Walsh (2009),
and the multivariate case in Zinde-Walsh (2010).



$$x = x^* + u_x; \qquad (1)$$

$$z = x^* + u. \qquad (2)$$



Consider

$$y = g(x^*) + u_y; \qquad (3)$$

$$x = x^* + u_x; \qquad (4)$$

$$z = x^* + u. \qquad (5)$$

Here the equations above provide a regression with $z$ representing a second
measurement or possibly a given projection onto a set of instruments for the
unobserved $x^*$. Either $y, z$ or $x, y, z$ are observed; $u$ is a Berkson
type measurement error independent of $z$; $u_y$, $u_x$ have zero conditional
(on $z$ and the other errors) expectations. Denote $w_1 = E(y|z)$ and denote
the density of the measurement error by $f$.

Example 3. Panel data model with two periods. 

Evdokimov (2010). 

Here let $x$ (or $z$) represent the observed variable in the first period
and $z$ (or $x$) the observed variable in the second; $x^*$ is the
nonparametric function $m(X, a)$, where $a$ is the idiosyncratic component,
and the densities are conditional on the same value $X$ for the two periods.
The same distributional assumptions as in Example 1 are used.

The models lead to the same system of convolution equations. All vectors
are in $R^d$.

By independence, in all cases we get

$$g * f = w_1.$$

For Examples 1 and 3 denote the density of $z$ by $f_z$ and the $k$th
component of the vector $x$ by $x_k$, and denote the observable
$E(f_z x_k | z)$ by $w_{2k}$, $k = 1, ..., d$.

For Example 2

$$E(x_k y | z) = E(x_k^* g(x^*) | z) = \int (z_k - u_k) g(z - u) f(u)\,du.$$

Denote here $E(x_k y | z)$ by $w_{2k}$, $k = 1, ..., d$.

Thus for all the examples we need to solve the system of convolution
equations

$$g * f = w_1; \qquad (6)$$

$$x_k g * f = w_{2k}, \quad k = 1, ..., d.$$
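
As a rough numerical illustration of the first equation of the system (not part of the original argument), the sketch below discretizes two hypothetical densities on a grid, forms their convolution, and checks that the discrete Fourier transform of the convolution equals the product of the transforms. The grid, the choice of a standard normal $g$ and a Laplace $f$, and the helper name `convolve_densities` are assumptions made only for this example.

```python
import numpy as np

def convolve_densities(g, f, dx):
    """Approximate (g * f) on a grid by a Riemann-sum convolution."""
    return np.convolve(g, f) * dx

# Hypothetical grid and densities, chosen only for illustration.
dx = 0.01
x = np.arange(-10.0, 10.0 + dx / 2, dx)
g = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)   # latent density: N(0, 1)
f = 0.5 * np.exp(-np.abs(x))                     # error density: Laplace(0, 1)

w1 = convolve_densities(g, f, dx)                # approximate density of z = x* + u
total_mass = w1.sum() * dx                       # should be close to 1

# Convolution theorem on the grid: with zero-padding to the full output
# length, the DFT of the convolution is the product of the DFTs.
n = len(w1)
lhs = np.fft.fft(w1)
rhs = np.fft.fft(g, n) * np.fft.fft(f, n) * dx
max_gap = np.max(np.abs(lhs - rhs))
```

The same multiplicative structure is what turns the convolution system into a system of products once Fourier transforms are taken; the grid version only mimics this for ordinary densities and says nothing about the generalized-function setting.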

It is advantageous to consider the functions as generalized functions. The
interest is often in the distributions of the unobservables, and there is no
reason to restrict those to be absolutely continuous; a density may not exist
as an ordinary function but can be represented as a generalized derivative of
the distribution function. Since the convolution equations are solved via
Fourier transforms, restricting the regression functions in Example 2 to have
ordinary Fourier transforms excludes binary choice and polynomial regression;
this restriction can be overcome by using generalized functions. Also, if
some variables have singular distributions, or if only some variables are
subject to measurement error, or if there is a mass point in the error
distribution (e.g. in measurement error from surveys with some portion of
true responses), convolution with a generalized $\delta$-function arises
naturally when the problems are considered in spaces of generalized functions.

The spaces of generalized functions most relevant for solving these problems
are the space $S'$ (tempered distributions); the space $D'$ and some related
spaces also play a role in the proofs. See, e.g., Zinde-Walsh (2010) for the
definitions, discussion and a summary of useful properties.

Section 2 presents the full solution to the system of equations (6),
extending the results in the current literature.

Section 3 discusses well-posedness. This clarifies two issues: under what
conditions consistent estimation of the identified general model is possible,
and in what sense a possibly misspecified parametric model delivers valid
analysis. The answer hinges on well-posedness of the identification of the
function $g$. Well-posedness means that $g$ depends on the distributions of
the observed variables in a continuous fashion. Well-posedness does not hold
if both the function $g$ and the density $f$ are supersmooth (that is, their
Fourier transforms decline exponentially). On the other hand, if either of
the two is such that its Fourier transform is continuously differentiable and
the reciprocal of that transform is a regular generalized function (grows no
faster than some power), then well-posedness in the weak topology of
generalized functions obtains; for well-posedness in stronger topologies
additional conditions need to be provided. An example shows that a Gaussian
density for both functions would lead to a violation of well-posedness.
Classes of nonparametric models that include the Gaussian and that lead to a
well-posed problem are defined. Further, the issue of regularization is
discussed.



2 The identification result 

Assumption 1. The generalized functions $g$, $f$, $w_1$ and $w_{2k}$,
$k = 1, ..., d$, are in the generalized function space $S'$ and are related
by (6).

Any generalized density functions are generalized derivatives of the
distribution function and belong to $S'$, so the convolution equations are
defined. For an ordinary function $b$, e.g. a regression function of
Example 2, to belong to $S'$ it is sufficient that it belong to some class of
functions on $R^d$, $\Phi(m, V)$ (with $m$ a vector of integers, $V$ a
positive constant), where $b \in \Phi(m, V)$ if

$$\int \prod_{i=1}^{d} (1 + t_i^2)^{-m_i} |b(t)|\,dt < V < \infty. \qquad (7)$$

Thus if, e.g., $b$ grows no faster than a polynomial, it is in $S'$, so that
the analysis here applies to binary choice and polynomial regression.
Convolutions with generalized functions from some classes are defined for
such functions (as discussed in Zinde-Walsh, 2010). For the conditional
density of Example 3 some extra assumptions on the joint density of the
regressors are required.

Consider now the Fourier transforms (Ft): $\gamma = Ft(g)$; $\phi = Ft(f)$;
$\varepsilon_1 = Ft(w_1)$, $\varepsilon_{2k} = Ft(w_{2k})$.

Assumption 2. Either $\phi$ or $\gamma$ is a continuous function that
satisfies (7).

The continuity assumption on the characteristic function is typically made;
any characteristic function satisfies (7).

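By Theorem 1 of Zinde-Walsh (2010) the following system of equations then
holds in $S'$, where $\gamma'_k$ denotes the partial derivative of $\gamma$
with respect to its $k$th argument:

$$\gamma \cdot \phi = \varepsilon_1; \qquad (8)$$

$$\gamma'_k \cdot \phi = i\varepsilon_{2k}, \quad k = 1, ..., d.$$

Assumption 3. $\mathrm{supp}(\phi) \supset \mathrm{supp}(\gamma) = W$, where
$W$ is a convex set in $R^d$ that includes 0 as an interior point.

The support assumption is necessary to solve for $\gamma$. In the case of
characteristic functions the interior point is zero, and the value of the
continuous characteristic function at that point is 1. If the system of
equations involves functions with $W$ having an interior point $a \neq 0$,
consider shifted functions.

Theorem 1. Under Assumptions 1-3, if

(a) $\gamma$ is continuously differentiable in $W$, $\gamma(0) = c$, then

$$\gamma(\zeta) = c \exp \int_0^{\zeta} \sum_{k=1}^{d} \varkappa_k(\xi)\,d\xi_k, \qquad (9)$$

with the uniquely defined continuous functions $\varkappa_k(\zeta)$ that solve

$$\varkappa_k(\zeta)\varepsilon_1 - i\varepsilon_{2k} = 0, \quad k = 1, ..., d;$$

or

(b) $\phi$ is continuously differentiable in $W$, $\phi(0) = c$, then

$$\gamma(\zeta) = \phi(\zeta)^{-1}\varepsilon_1(\zeta), \qquad (10)$$

$$\phi(\zeta) = c \exp \int_0^{\zeta} \sum_{k=1}^{d} \varkappa_k(\xi)\,d\xi_k,$$

with the uniquely defined continuous functions $\varkappa_k(\zeta)$ that solve

$$\varepsilon_1\varkappa_k - ((\varepsilon_1)'_k - i\varepsilon_{2k}) = 0, \quad k = 1, ..., d.$$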

The proof is in the proof of Theorem 3 and the Corollary of Zinde-Walsh
(2010). If the support of $\phi$ coincides with the support of $\gamma$, then
$\phi$ (and thus the function $f$) is identified.

The proof is set in the space $S'$ of generalized functions and does not rely
on the existence of densities. The proof of (b) in the univariate case was
first provided in Zinde-Walsh (2009), correcting the result of Schennach
(2007). The formula for the density in Cunha et al. (2010) is valid in case
(a) and can be interpreted in terms of generalized functions
("distributions"). In case (b), though, a different solution is given here.
Thus identification requires differentiability of either $\gamma$ or $\phi$;
when $\gamma$ is not differentiable and the result in Cunha et al. (2010)
does not hold, identification is still possible in case (b). This also
extends the identification result of Evdokimov (2010).
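
To illustrate case (a) in the simplest setting, the following sketch takes $d = 1$ and recovers $\gamma$ from $\varepsilon_1$ and $\varepsilon_2$ alone by integrating $\varkappa = i\varepsilon_2/\varepsilon_1$, which equals $\gamma'/\gamma$ when $\gamma\phi = \varepsilon_1$ and $\gamma'\phi = i\varepsilon_2$. The Gaussian $g$, the Laplace error, the Fourier convention $Ft(g)(\zeta) = \int g(x)e^{ix\zeta}dx$, and the function name `recover_gamma` are assumptions for illustration; this is a numerical sketch, not the proof technique of the theorem.

```python
import numpy as np

def recover_gamma(zeta, eps1, eps2, c=1.0):
    """Case (a) with d = 1: gamma = c * exp(int_0^zeta kappa), kappa = i*eps2/eps1.

    Assumes zeta is an increasing grid starting at 0 and eps1 has no zeros on it.
    """
    kappa = 1j * eps2 / eps1
    steps = 0.5 * (kappa[1:] + kappa[:-1]) * np.diff(zeta)   # trapezoid rule
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    return c * np.exp(integral)

# Hypothetical example: g is the N(mu, sigma^2) density, f is Laplace(0, 1).
mu, sigma = 0.7, 1.3
zeta = np.linspace(0.0, 3.0, 3001)
gamma_true = np.exp(1j * mu * zeta - 0.5 * sigma**2 * zeta**2)
gamma_prime = (1j * mu - sigma**2 * zeta) * gamma_true
phi = 1.0 / (1.0 + zeta**2)          # Laplace characteristic function

eps1 = gamma_true * phi              # transform of w_1
eps2 = -1j * gamma_prime * phi       # transform of w_2 under the assumed convention

# phi is not passed to the recovery step: only the "observables" are used.
gamma_hat = recover_gamma(zeta, eps1, eps2)
max_err = np.max(np.abs(gamma_hat - gamma_true))
```

In this example $\varkappa$ is linear in $\zeta$, so the trapezoid rule is essentially exact; with estimated $\varepsilon_1$, $\varepsilon_2$ the ratio would be noisy wherever $\varepsilon_1$ is small, which is where the well-posedness question becomes relevant.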

3 Well-posedness 

We now consider whether closeness of the distributions of the observables
necessarily implies closeness of the unknown functions.

A sufficient condition is provided in Zinde-Walsh (2010, Theorem 4). When
identification is based on (b) of Theorem 1 here, the model class needs to be
restricted to include only measurement error distributions with $\phi^{-1}$
in $\Phi(m, V)$ for some $m$, $V$. Analogously, when identification is based
on (a), the sufficient condition is for the class of models to be restricted
to those where the latent factor distribution is such that
$\gamma^{-1} \in \Phi(m, V)$.

These conditions exclude models where both $g$ and $f$ are supersmooth with
$\mathrm{supp}(\gamma)$ unbounded, leading to a supersmooth distribution for
$w_1$. Although these conditions are only shown to be sufficient, an example
below (from Zinde-Walsh 2009 and 2010) demonstrates that a Gaussian
distribution (which violates these conditions) fails well-posedness in the
weak topology of generalized functions in $S'$ and therefore in any stronger
topology or metric (uniform, $L_1$, etc.).

Example 4. Consider the function $\phi(x) = e^{-x^2}$, $x \in R$. Consider in
$S$ the function

$$b_n(x) = \begin{cases} e^{-n} & \text{if } n - \frac{1}{n} < x < n + \frac{1}{n}; \\ 0 \le b_n(x) \le e^{-n} & \text{if } n - \frac{2}{n} < x < n + \frac{2}{n}; \\ 0 & \text{otherwise.} \end{cases} \qquad (11)$$

This $b_n(x)$ converges to $b(x) \equiv 0$ in $S'$. Indeed, for any
$\psi \in S$

$$\int_{-\infty}^{\infty} b_n(x)\psi(x)\,dx = \int_{n-2/n}^{n+2/n} b_n(x)\psi(x)\,dx \to 0.$$

Now consider $\varepsilon_n = \varepsilon + b_n \to \varepsilon$. We show
that $\varepsilon_n\phi^{-1}$ does not converge in $S'$ to
$\varepsilon\phi^{-1}$. Such convergence would imply that
$(\varepsilon_n - \varepsilon)\phi^{-1} = b_n\phi^{-1} \to 0$ in $S'$.

But the sequence $b_n(x)\phi(x)^{-1}$ does not converge. Indeed, if it did,
then $\int b_n(x)\phi^{-1}(x)\psi(x)\,dx$ would converge for any
$\psi \in S$. But for $\psi \in S$ such that $\psi(x) = \exp(-x)$ for large
$x$, and for all sufficiently large $n$,

$$\int_{n-2/n}^{n+2/n} b_n(x)e^{x^2}\psi(x)\,dx \ge \int_{n-1/n}^{n+1/n} b_n(x)e^{x^2}\psi(x)\,dx \ge e^{-n}\int_{n-1/n}^{n+1/n} e^{x^2 - x}\,dx \ge \frac{2}{n}e^{-2n + (n - 1/n)^2}.$$

This diverges. ■

Thus, e.g. for the Gaussian distribution there are models with unknown 
functions that are far from each other in S', but that lead to observable 
functions that are arbitrarily close. 
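
The orders of magnitude in Example 4 can be checked with a few lines of arithmetic. The sketch below evaluates a simple upper bound on $|\int b_n\psi|$ (interval length $4/n$ times the height $e^{-n}$, assuming $|\psi| \le 1$) and the logarithm of the lower bound $\frac{2}{n}e^{-2n+(n-1/n)^2}$ for the integral against $\phi^{-1}$. The function names and the normalization $|\psi| \le 1$ are assumptions made for this illustration only.

```python
import math

def upper_bound_weak(n):
    """Bound on |int b_n(x) psi(x) dx| when 0 <= b_n <= exp(-n) on
    (n - 2/n, n + 2/n) and |psi| <= 1: interval length times height."""
    return (4.0 / n) * math.exp(-n)

def log_lower_bound_inverse(n):
    """Log of (2/n) * exp(-2n + (n - 1/n)^2), the lower bound for
    int b_n(x) exp(x^2) psi(x) dx when psi(x) = exp(-x) near x = n.
    The log is used because the bound itself overflows quickly."""
    return math.log(2.0 / n) - 2.0 * n + (n - 1.0 / n) ** 2

for n in (2, 5, 10, 20):
    print(n, upper_bound_weak(n), log_lower_bound_inverse(n))
```

The first column of values shrinks toward zero while the logarithm of the second grows roughly like $n^2$, which is the numerical face of the statement that the perturbation $b_n$ is negligible for the observables yet unbounded after division by the Gaussian $\phi$.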
When estimation is in fact based on a parametric model and the nonparametric
identification result is interpreted as supporting wider applicability, the
question arises as to which nonparametric models are close to a model
misspecified as parametric. This question may be posed, e.g., for the
analysis of Cunha et al. (2010), who use Gaussian and mixed Gaussian
distributions in estimation. Is there some meaningful nonparametric class
that includes the Gaussian where observationally close models imply closeness
of latent factors?

Define a class of generalized functions $\Phi(B, A, m, V) \subset S'$ for
some positive constant $B$ and matrix $A$: a generalized function
$b \in \Phi(B, A, m, V)$ if there exists a function
$\bar{b}(\zeta) \in \Phi(m, V)$ with support in $\|\zeta\| > B$ such that
also $\bar{b}(\zeta)^{-1} \in \Phi(m, V)$ and
$b \cdot I(\|\zeta\| > B) = \bar{b}(\zeta)\exp(-\zeta' A \zeta)$. Note that a
linear combination of functions in $\Phi(B, A, m, V)$ belongs to the same
class. For a sequence $b_n \in \Phi(B, A, m, V)$ to converge to zero as
generalized functions it is necessary that the corresponding $\bar{b}_n$
converge to zero (a.e.).

Assumption 4. $\gamma \in \Phi(B, A_\gamma, m, V)$;
$\phi \in \Phi(B, A_\phi, m, V)$.

If this assumption holds, $\varepsilon_1 = \gamma \cdot \phi \in \Phi(B, A_\gamma + A_\phi, 2m, V^2)$,
and $\varepsilon_{2k} = -i\gamma'_k \cdot \phi$ is also in
$\Phi(B, A_\gamma + A_\phi, 2m, V^2)$.

Theorem 2. Under the conditions of Theorem 1 and Assumption 4 applying to the
generalized functions $\varepsilon_{i,n}$, $i = 1, 2$, if
$\varepsilon_{i,n} \to \varepsilon_i$ in $S'$, then the corresponding
solutions $\gamma_n$ given by (9) or (10) converge to $\gamma$ in $S'$.

Proof. From the conditions of the theorem,
$\eta_{i,n} = \varepsilon_{i,n} - \varepsilon_i$ converges in $S'$ to zero.
From the nature of the identified solution (9) or (10) it follows that if it
can be shown that $\varepsilon_1^{-1}\eta_{i,n}$ converges to zero, then the
problem for the distribution of the latent factor is well-posed. But this is
indeed the case: since the exponents cancel and $\bar{\eta}_{i,n}$ converges
to zero in $S'$, convergence to zero follows by hypocontinuity of the product
$\bar{\varepsilon}_1^{-1}\bar{\eta}_{i,n}$. ■

The values for the tail exponent have to be fixed, implying that even a
slight deviation in the exponent violates the well-posedness condition: there
are no two different Gaussian distributions in the class. Since an estimated
Gaussian will differ from the true Gaussian, they cannot belong to the same
nonparametric class; thus there is a separation between estimation in the
parametric Gaussian problem and in the fully general nonparametric
specification in $S'$.

Consider a solution regularized with a weighting function. It is the high
frequency components that cannot be identified in convolution with a
supersmooth function, and regularization smooths those out. Fix a function
$\psi \in D$. Using this weight on the Fourier transform is equivalent to
solving the convolution equations

$$g * f * Ft^{-1}(\psi) = w_1 * Ft^{-1}(\psi); \qquad (12)$$

$$x_k g * f * Ft^{-1}(\psi) = w_{2k} * Ft^{-1}(\psi), \quad k = 1, ..., d.$$

As in the proof of Theorem 3 of Zinde-Walsh (2010), for any such $\psi$ the
solution exists, because multiplication by a continuous function
$\gamma^{-1}$ or $\phi^{-1}$ with arbitrary growth at infinity is permitted
since the support of $\psi$ is bounded.
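
A small numerical sketch may help to see why a weight with bounded support tames the division by a supersmooth transform. Below, a Gaussian characteristic function is inverted on a grid with and without a smooth bump weight supported in $|\zeta| < C$. The bump formula, the value $C = 4$, the grid, and the helper name `bump` are illustrative assumptions, not choices made in the text.

```python
import numpy as np

def bump(zeta, C):
    """Smooth weight with support in |zeta| < C (a standard test function in D),
    normalized so that bump(0) = 1."""
    out = np.zeros_like(zeta, dtype=float)
    inside = np.abs(zeta) < C
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - (zeta[inside] / C) ** 2))
    return out

C = 4.0
zeta = np.linspace(-12.0, 12.0, 4801)
phi = np.exp(-0.5 * zeta**2)          # Gaussian characteristic function (supersmooth)

unweighted = 1.0 / phi                # grows like exp(zeta^2 / 2) without bound
weighted = bump(zeta, C) / phi        # vanishes outside |zeta| < C

max_unweighted = unweighted.max()
max_weighted = weighted.max()
```

Since the bump is at most 1, the weighted multiplier is bounded by $e^{C^2/2}$ however large the grid is, whereas the unweighted one keeps growing with the grid; the price, as the text goes on to note, is that only functions whose transform lives on the bounded support can be recovered exactly.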

Schwartz (1964, pp. 271-273) gives a characterization of functions in $S'$
with a Fourier transform that has bounded support (in a cube $|x_k| \le C$,
$k = 1, ..., d$) based on the Wiener-Paley theorem. Such a function is a
continuous function $g$ that can be extended to an entire analytic function
$G$ of a complex argument and is of exponential type $\le 2\pi C$, meaning

$$\limsup_{|z| \to \infty} \frac{\log|G(z)|}{|z_1| + ... + |z_d|} \le 2\pi C.$$

Thus as long as $g$ is such a function it can be expressed via the
regularized solution. As in Schwartz, the subspace of all functions of
exponential type (for any finite $C$) can also be considered. However, the
regularized solutions may not come close to a true $g$ that does not belong
to this subspace.